-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limited Process Pools #791
Conversation
I suggest using the A link for the lazy: https://github.com/sindresorhus/ava/pull/791/files?w=1 |
@@ -2,3 +2,4 @@ node_modules | |||
.nyc_output | |||
coverage | |||
bench/.results | |||
.idea |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a newline at the end of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't add this. It should be in your own global gitignore, not here.
It should use the pool but with a concurrency of I wonder if, rather than duplicating the code, we can use infinite concurrency when the pool limit is not set, and still wait for all stats unless there is limited concurrency? |
@novemberborn I "forked" the code because of the logic that initializes each test looking for a |
@novemberborn also concurrency of 1 makes more sense than me whining that serialMap doesn't allow concurrency :| I was so deep in the weeds I was like "I can't guarantee proper execution order" not realizing who cares as long as they ran one at a time |
Yea that makes sense. I'm proposing adding a logic branch for when concurrency is restricted, which I realize may be worse than duplicating code. |
If we either de-scope exclusive tests/ The more I think about previous suggestions of static analysis for |
Yea maybe we can live with the duplication until we sort out |
}); | ||
}); | ||
|
||
return new Promise(function (resolve) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to reduce the nesting below here ⬇️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll tinker and see what I can do
Should we have a better default than Infinity? From experience, sometimes it's more efficient with 10 in concurrency than Infinity, since systems are bad at handling that much concurrency and get overloaded. We should maybe do some test runs and find some magic number as default. |
@sindresorhus I'm more familiar with multi-process in Python and a quick check shows that Celery likes to default to the number of CPUs. |
@dcousineau Last time I did concurrency profiling, I found the following to be optimal: Math.max((os.cpus().length || 1) * 2, 2) |
Renamed the CLI flag to `--concurrency, -c` as per @sindresorhus suggestion. Marked it as experimental instead of beta. Updated the new concurrency pool runner to force pool size to 1 when `--serial` flag is set as per @novemberborn suggestion. Reduced the nesting complexity of the concurrency pool code and broke up portions of the logic to ensure readability.
@sindresorhus I'm cool with that. As a part of my most recent update I noticed the conflicts and will have a resolution pushed soon |
@@ -224,3 +236,86 @@ Api.prototype._run = function (files, _options) { | |||
return runStatus; | |||
}); | |||
}; | |||
|
|||
Api._blankResults = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a getter that returns the object. It might be an edge case, but if the consumer modifies the object, later results will also be modified. I might be missing something though and this might be a non-issue, but thought I would bring it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everywhere that returns this object structure (but with real data) forms it themselves. On error we need to return an empty object so the runStatus.processResults(results);
doesn't choke on an undefined so ideally no consumer gets the blank results structure. I only pulled it into its own variable because there are 3 failure cases that need to send it up stream.
Writing this I realized I can simplify down to 2 and will be pushing a change to simplify as well as changing it to a getter Just In Case™
' --watch, -w Re-run tests when tests and source files change', | ||
' --source, -S Pattern to match source files so tests can be re-run (Can be repeated)', | ||
' --timeout, -T Set global timeout', | ||
' --concurrency, -c Maximum number of child processes (EXPERIMENTAL)', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use two spaces to separate the flag and description columns
Math.max((os.cpus().length || 1) * 2, 2) Can be simplified to just: (os.cpus().length || 1) * 2 |
' --watch, -w Re-run tests when tests and source files change', | ||
' --source, -S Pattern to match source files so tests can be re-run (Can be repeated)', | ||
' --timeout, -T Set global timeout', | ||
' --concurrency, -c Maximum number of child processes (EXPERIMENTAL)', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this?
Maximum number of test files running at the same time (EXPERIMENTAL)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also sync this output into the readme? Just copy paste node cli.js --help
.
@@ -132,21 +132,24 @@ AVA comes with an intelligent watch mode. [Learn more in its recipe](docs/recipe | |||
```console | |||
$ ava --help | |||
|
|||
Futuristic test runner 🚀 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentionally left out as it just duplicates our readme and repo description.
Tests are failing |
@sindresorhus @novemberborn Sorry about the delay, made the updates. |
I found an issue with the The updated and new tests can be found here since the diff tab isn't showing them: https://github.com/dcousineau/ava/blob/process-pool/test/api.js#L920-L967 |
@@ -116,6 +116,19 @@ Api.prototype._run = function (files, _options) { | |||
self.precompiler = new CachingPrecompiler(cacheDir, self.options.babelConfig); | |||
self.fileCount = files.length; | |||
|
|||
var overwatch; | |||
if (this.options.concurrency > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should still use _runNoPool
if concurrency is less than the fileCount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally I would agree but with the current behavior, no matter what if the --concurrency
flag is used the code behind _runLimitedPool
is used. I vote we keep this as-is until we decide to drop _runNoPool
entirely so that we can deterministically say if you use the concurrency flag you will use the new process pool code.
If enough people object to my opinion, however, I will change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if I add a concurrency
option to package.json
, I want to be able to re-enable .only
behavior. By doing
ava [files ...]
Where the number of files is less than concurrency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm similarly worried about people using the --concurrency
flag and seeing different behaviors based on their filename glob patterns. It's much simpler to convey in documentation "If the concurrency flag/option is present, automatic .only behavior is not supported" than a list of rules.
Additionally, a user manually running ava with a deliberate subset of tests is likely only doing so to run a few specific tests because they internally do not use .only
(like my team) which actually cycles back up to my argument of nixing .only
entirely since with the --watch
flag and good file name globbing, there is no real need for .only
.
GH isn't letting me comment directly on https://github.com/dcousineau/ava/blob/process-pool/test/api.js#L920 Does this test cover the scenario where we have more files than concurrency? We would need tests for:
|
Similar here - https://github.com/dcousineau/ava/blob/process-pool/test/api.js#L769 Do we need a test that verifies the test counts are added up correctly across batched runs? |
}; | ||
test.run(options) | ||
.then(resolve) | ||
.catch(reject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.then(resolve, reject)
. resolve
won't throw, so we can save the extra promise allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh, I prefer it the existing way. Two argument then
is considered an anti pattern now.
@sindresorhus @jamestalmage @vdemedes what shall we do about the |
@novemberborn looking over the older code I see how I can move it above the |
@jamestalmage Dropping the concurrency pool to 1 and re-running the tests ensured they still pass. I'm checking for matches after all test files have run so it shouldn't be an issue but I can circle back and come up with a few more tests |
Let's kill If you can "circle back" with a few more tests, that would be great. We want to guard against regressions. |
…an concurrency counts
LGTM |
🎉 Woo! Thanks for your perseverance on this @dcousineau :) |
Well done! |
Great work @dcousineau! By the way I sat in front of you at ManhattanJS yesterday :p |
@sotojuan Ha! Next time grab me and say hi! (I'm at all of BoroJS meetups obviously :P). Sidebar: have you considered reaching out for EmpireJS OSS Night? |
@dcousineau If I am still in NYC by then I'll go. Thanks! |
Functionality to address #718.
Usage
RunningRunningava --pool-size=5
ava --concurrency=5
will switch behavior to the limited process pool code (otherwise existing 'legacy' code will always run).IfIf--serial
is defined the limited pool code is not run.--serial
is specified with a concurrency size, the concurrency size will be overridden and forced to 1.Notes
I encapsulated existing functionality and left it default. We cannot do the exclusive/
.only
with the limited pools as it requires us to have loaded and understand the entire test suite to make the decision about whether there is exclusivity to respect or not.As a beta feature using the process pool limit could come with the caveat that any features related to
.only
are disabled/ignored. For our use case the pool limiting is being used to keep memory usage under circleci's limits and ideally no.only
tests make it into source control.Easy Wins
The watcher worked right out of the box
Things that make me nervousThe error handling, particularly aroundCouldn\'t find any matching tests
, feels a bit like I'm stumbling in the dark, they work and I'm not 100% confident I know why.Tests
Testing was done initially by wrapping almost all
api.js
tests in a generator, and running them twice: once against the old behavior and once against the new behaviorAs a result 3 tests are currently failing. This is because theres no real way to test exclusivity/deal with.only
. These tests cannot pass until the fate of.only
in relation to process pools is decided.Tests related to
.only
functionality are not run against the new behavior.Personal Opinions
While ava could potentially statically analyze all the test files before entering in the process pool to look for exclusivity I'm concerned this is wasted effort that could be better spent deprecating .only and improving the filename globs.
Alternatively .only could be scoped to be intended to only apply to the current running file (and not to the whole testsuite) which would eliminate any need for static analysis and if a developer wanted to run only a single file they could simply use filename glob matching on a case by case basis.